<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<HTML>
<HEAD>
<META http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<STYLE TYPE="text/css">
		BODY {
			background-color: f5f5f5;
			font-family: arial, verdana; font-size: small;
		}
		H2, H3, A, .tfc {
			color: #000080;
			font-family: arial, verdana; font-size: small;
		}
		PRE { 
		    font-family: monospace;
			border: 1px lightgrey dotted;
		    white-space: pre; 
		    color: black;
		    background-color: #efefef; 
		}
		TABLE {
			float: right;
			border-collapse: collapse;
			border-top: 1px solid #000000;
			border-right: 1px solid #000000;
			border-left: 1px solid #000000;
		}
		TH {
			padding-top: 2px;
			padding-bottom: 2px;
			padding-right: 5px;
			padding-left: 5px;
		}
		TD {
			padding-top: 2px;
			padding-bottom: 2px;
			padding-right: 5px;
			padding-left: 5px;
			border-bottom: 1px solid #000000;
			border-right: 1px solid #000000;
			font-family: arial, verdana; font-size: small;
		}
	</STYLE>
<TITLE>Csv</TITLE>
</HEAD>
<BODY>
<H1>4. Csv</H1>
The <I>csv</I>(3m) module parses the popular comma separated values (CSV) format exported by applications like spreadsheets. This parser considers quoted elements, quoted quotes, and preserves carriage returns and newlines within elements. The separator character is specified by the user.

<P>
<B CLASS="tfc">Example 1. Reading /etc/passwd</B>
<BR>
The following example uses <I>csv</I>(3m) to parse the <tt>/etc/passwd</tt> file and print the username and uid of each user.
<PRE>

  #include &lt;stdlib.h&gt;
  #include &lt;stdio.h&gt;
  #include &lt;mba/csv.h&gt;
  
  int
  main(void)
  {
      FILE *in;
      unsigned char buf[1024];
      unsigned char *row[10];
      int n;
  
      in = fopen("/etc/passwd", "r");
  
      while ((n = csv_row_fread(in, buf, 1024, row, 10, ':', CSV_TRIM)) &gt; 0) {
          printf("%s %s\n", row[0], row[2]);
      }
      fclose(in);
  
      return EXIT_SUCCESS;
  }
  </PRE>
</P>
<p>
</p>
Please note that escaping a quote requires that the entire element be quoted as well. So a quoted string element would actually look like <tt>"""foo"""</tt> where the outer quotes quote the element and the remaining quote pairs are each an escape followed by the literal quote. This is consistent with how popular spreadsheets behave and deviating from this behavior will generate an error.
<H3>4.1. Csv functions</H3>
<A name="row_parse"></A>
<P>
</P>
<B CLASS="tfc">The <TT>csv_row_parse</TT> function</B>
<BR>
<B>Synopsis</B>
<PRE>
<BR>  #include &lt;mba/csv.h&gt;
  int csv_row_parse(const tchar *src,
           size_t sn,
           tchar *buf,
           size_t bn,
           tchar *row[],
           int rn,
           int sep,
           int flags)<BR>
</PRE>
<B>Description</B>
<BR>
The <TT>csv_row_parse</TT> function will parse a line of text at <tt>src</tt> for no more than <tt>sn</tt> bytes and place pointers to zero terminiated strings allocated from no more than <tt>bn</tt> bytes of the memory at <tt>buf</tt> into the array <tt>row</tt> for at most <tt>rn</tt> data elements. The <tt>sep</tt> parameter must specify a separator to use (e.g. a comma <tt>','</tt>) and must not be a quote, carriage return, or newline. Comma <tt>','</tt>, tab <tt>'\t'</tt>, colon <tt>':'</tt>, and pipe <tt>'|'</tt> are common separators.
<p>
</p>
The <tt>flags</tt> parameter can be zero or any combination of <tt>CSV_TRIM</tt> and <tt>CSV_QUOTES</tt>. If <tt>CSV_TRIM</tt> is specified, strings will be trimmed of leading and trailing whitespace (but an unquoted carriage-return before a newline is always trimmed regardless). If the <tt>CSV_QUOTES</tt> flag is spcecified, quotes will be interpreted. Both flags should be specified when parsing conventional CSV files.
<p>
</p>
The <TT>csv_row_parse</TT> function is actually a macro for either <TT>csv_row_parse_str</TT> or <TT>csv_row_parse_wcs</TT>. The <TT>csv_row_parse_wcs</TT> function has the same prototype but accepts <tt>wchar_t</tt> parameters whereas <TT>csv_row_parse_str</TT> accepts <tt>unsigned char</tt> parameters.
	<BR>
<B>Returns</B>
<BR>The <tt>csv_row_parse</tt> function returns the number of bytes of <tt>src</tt> parsed or -1 if an error occured in which case <tt>errno</tt> will be set appropriately.<P>
</P>
<A name="row_fread"></A>
<P>
</P>
<B CLASS="tfc">The <TT>csv_row_fread</TT> function</B>
<BR>
<B>Synopsis</B>
<PRE>
<BR>  #include &lt;mba/csv.h&gt;
  int csv_row_fread(FILE *in,
           unsigned char *buf,
           size_t bn,
           unsigned char *row[],
           int numcols,
           int sep,
           int flags)<BR>
</PRE>
<B>Description</B>
<BR>
Read a line of text from the stream <tt>in</tt>, process the line with <TT>csv_row_parse</TT> and place pointers to zero terminiated strings allocated from no more than <tt>bn</tt> bytes of the memory at <tt>buf</tt> into the array <tt>row</tt> for at most <tt>rn</tt> data elements.
	<BR>
<B>Returns</B>
<BR>The <tt>csv_row_fread</tt> function returns the number of bytes read from the stream <tt>in</tt> or -1 if an error occured in which case <tt>errno</tt> will be set appropriately.<P>
</P>
<HR NOSHADE>
<small>
Copyright 2003 Michael B. Allen &lt;mba2000 ioplex.com&gt;
</small>
</BODY>
</HTML>
